ABSTRACT

DNA repeats constitute potential sites for the nucle- 
ation of secondary structures such as hairpins and 
cruciforms. Studies performed mostly in bacteria 
and yeast showed that these noncanonical DNA 
structures are breakage-prone, making them candi- 
date targets for cellular DNA repair pathways. 
Possible culprits for fragility at repetitive DNA se- 
quences include replication and transcription as 
well as the action of structure-specific nucleases. 
Despite their patent biological relevance, the par- 
ameters governing DNA repeat-associated chromo- 
somal transactions remain ill-defined. Here, we 
established an episomal recombination system 
based on donor and acceptor complementary DNA 
templates to investigate the role of direct and 
inverted DNA repeats in homologous recombination 
(HR) in mammalian cells. This system allowed us 
also to ascertain in a stringent manner the impact 
of repetitive sequence replication on homology- 
directed gene repair. We found that nonspaced 
DNA repeats can, per se, engage the HR pathway 
of the cell and that this process is primarily de- 
pendent on their spacing and relative arrangement 
(i.e. parallel or antiparallel) rather than on their 
sequence. Indeed, our data demonstrate that 
contrary to direct and spaced inverted repeats, 
nonspaced inverted repeats are intrinsically 
recombinogenic motifs in mammalian cells lending 
experimental support to their role in genome 
dynamics in higher eukaryotes. 

INTRODUCTION 

The genomes of prokaryotes and eukaryotes harbor
numerous and diverse types of repetitive DNA sequences,
many of which have been associated with genome evolution,
regulation of gene expression and chromosomal
rearrangements underlying a number of inherited disorders
and certain translocation-bearing tumors. These motifs 
include single direct and inverted DNA repeats with or 
without internal spacers as well as high-copy-number 
tandem tracts (1-4). Accumulating evidence indicates 
that DNA repeats can adopt different noncanonical (i.e. 
non-B) DNA conformations depending on a number of 
intrinsic parameters. These include the nucleotide compos- 
ition, the length and the relative orientation of the con- 
stituent DNA units as well as their spacing and extent of 
sequence identity. Extrinsic factors such as the torsional 
strain associated with DNA metabolic processes, 
chromatinization and transcription are also thought to 
influence the likelihood that DNA repeats acquire 
higher-order conformations. DNA conformers have been 
implicated in both physiological and pathological 
processes including the regulation of DNA replication 
and expression, oncogenic chromosomal rearrangements 
and gene amplification (1-7). Related to this, palindromes 
(i.e. uninterrupted or nonspaced inverted DNA repeats) 
and inverted DNA repeats with relatively short central 
spacers, can, via local negative superhelical stress and 
ensuing intrastrand hybridization and branch migration, 
extrude into four-way Holliday junction-like DNA struc- 
tures or cruciforms. Inverted DNA repeats in single- 
stranded form may also originate stem-loops or hairpins 
via intrastrand annealing. This may, for instance, occur 
when the unwinding of double-helical DNA during repli- 
cation creates a lagging strand template. 

Rearrangements of chromosomal DNA carrying
noncanonical structures are likely preceded by and
dependent on phosphodiester bond cleavage, presumably
via their resolution and processing by structure-specific
nucleases. This might occur in concert with DNA 
replication-associated phenomena such as replisome 
stalling or slippage. In Escherichia coli, physical
evidence was recently obtained for the emergence of
double-stranded DNA breaks (DSBs) at a 246-base pair
(bp) palindrome via the combined effects of DNA
replication and cleavage by the Mre11/Rad50 homolog SbcCD
(8). Interestingly, studies carried out in a yeast model
system revealed that inverted repeats of a 320-bp
retrotransposon-derived human Alu sequence inserted
into a LYS2 reporter allele were a target for the
Mre11/Rad50/Nbs1 complex, suggesting evolutionary
conservation of DNA structure-processing biochemical
pathways (9). Moreover, in these prokaryotic and lower
eukaryotic
model systems, reporter gene expression rescue assays 
showed that long (i.e. > 150 bp) inverted repeat-associated 
DNA breaks could engage the error-free homologous re- 
combination (HR) pathway (3). Notwithstanding steady 
progress in this field, many questions remain with respect 
to the relationships between specific parameters of repeti- 
tive DNA motifs, putative ensuing higher-order DNA 
conformations and the recruitment of cellular pathways 
that regulate genetic recombination. This knowledge gap 
is particularly acute in cells of higher eukaryotes (2,3). In 
addition, hitherto, the vast majority of studies on the bio- 
logical activity and fate of repetitive DNA in vivo focused 
on endogenous or exogenous test sequences embedded 
within the chromosomal DNA of dividing cells. With 
these types of experimental setups, it is difficult to assess 
a possible contribution of template DNA replication to 
repeat-associated phenomena. 

Here, we developed and deployed an extrachromosomal 
recombination system to specifically address the role of 
single DNA repeats of different sequence, arrangement 
(i.e. parallel or antiparallel) and spacing in HR-mediated 
DNA repair in mammalian cells. Furthermore, the intro- 
duction of a eukaryotic origin of replication into the re- 
petitive DNA-containing episomes allowed us to also 
investigate the impact of target template DNA replication 
on the recombinogenic potential of the various motifs. We 
demonstrate that simple palindromes and composite 
inverted DNA repeats, but not direct or spaced inverted 
DNA repeats, serve as targets for the error-free HR repair 
pathway in mammalian cells and that this process is inde- 
pendent of ongoing DNA repeat-bearing molecule 
replication. 

MATERIALS AND METHODS 

Cells 

HeLa cells [American Type Culture Collection (ATCC)],
human embryonic kidney (HEK) 293T cells (ATCC) and
911 cells (10) were cultured in Dulbecco's modified Eagle's
medium (DMEM; Invitrogen) supplemented with 5%
fetal bovine serum (FBS; Invitrogen). PER.tTA.Cre76
cells (11) and COS-7 cells (ATCC) were propagated in
DMEM supplemented with 10% FBS. All cells were
cultured at 37°C in an atmosphere of 10% CO2 in
humidified air.

Recombinant DNA 

Plasmid pA1.GFP.A2 has been described previously
(GenBank accession number: GQ380658) (12). An
XmaJI recognition sequence was introduced in
pA1.GFP.A2 at nucleotide positions 620 through 625 of
the humanized Renilla reniformis green fluorescent protein
(GFP) open reading frame (ORF) by polymerase chain
reaction (PCR) site-directed mutagenesis to generate the
acceptor plasmid pR6K.GFP.STOP. The same point
mutation also disrupts the GFP ORF in pR6K.GFP.STOP
with an amber stop codon (Figure 1). The nucleotide
sequences of the sense and antisense primers used for
introducing the mutation that created the XmaJI site were
5'-GAAGACCTAGGTGGAGGAC-3' and
5'-GTCCTCCACCTAGGTCTTC-3', respectively (point
mutation is underlined), and the PCR was carried out with
Phusion High-Fidelity DNA polymerase (Finnzymes)
according to the instructions provided by the manufacturer.
Oligodeoxyribonucleotides used for the introduction of
DNA sequences into the XmaJI site of pR6K.GFP.STOP
were
5'-CTAGGAGCGAGCGAGCGAGCGAGCGAGCGCCGAGCCCCAACTAGT-3' and
5'-CTAGACTAGTTGGGGCTCGGCGCTCGCTCGCTCGCTCGCTCGCTC-3'
(DR/IR.1),
5'-CTAGGAAGGCGCGAGGGACCGCCGAGCAGGCGAGCCCCAACTAGT-3' and
5'-CTAGACTAGTTGGGGCTCGCCTGCTCGGCGGTCCCTCGCGCCTTC-3'
(DR/IR.2),
5'-CTAGAGACGACGCAGCGAGCGAGCGAGCGCCACCGACGCACTAGT-3' and
5'-CTAGACTAGTGCGTCGGTGGCGCTCGCTCGCTCGCTGCGTCGTCT-3'
(DR/IR.3) and
5'-CTAGGAAGGCGCGAGGGAGGGACCGCCGAGCAGGCACCGACGCACTAGT-3' and
5'-CTAGACTAGTGCGTCGGTGCCTGCTCGGCGGTCCCTCGCGCCTTC-3'
(DR/IR.4). Insertion of a recognition sequence for the
meganuclease I-SceI into the XmaJI site of
pR6K.GFP.STOP was accomplished using the
oligodeoxyribonucleotides
5'-CTAGGAAGTTACGCTAGGGATAACAGGGTAATATAGACTAGT-3' and
5'-CTAGACTAGTCTATATTACCCTGTTATCCCTAGCGTAACTTC-3'.
To generate pR6K.GFP.STOP derivatives containing DNA
repeats in a head-to-tail (direct repeat) or tail-to-tail
(inverted repeat) configuration, the constructs carrying
single copies of the oligodeoxyribonucleotide pairs
corresponding to DR/IR.1, DR/IR.2, DR/IR.3 and DR/IR.4
were linearized with BcuI and subsequently subjected to a
second round of oligodeoxyribonucleotide cloning.
Restriction fragment size analysis was used to distinguish
between recombinant plasmids carrying a direct or an
inverted repeat of each oligodeoxyribonucleotide pair. To
disrupt the palindrome in the IR.1-containing
pR6K.GFP.STOP derivative acceptor IR.1 (Figure 1B and C)
at the center of symmetry, the plasmid was digested with
BcuI and its backbone was combined with the
oligodeoxyribonucleotide pair containing the I-SceI
recognition sequence (ScR). The resulting construct was
designated acceptor spIR.1. To create an acceptor plasmid
in which the GFP ORF is interrupted by the composite
adeno-associated virus type 2 (AAV) inverted terminal
repeat (ITR), AAV vector shuttle plasmid pDD2 (13) was
digested with PvuII and BspLI. The resulting 127-bp
AAV ITR-specific DNA fragment was inserted into the
XmaJI site of pR6K.GFP.STOP following its
blunt-ending with the Klenow fragment of E. coli DNA
polymerase I (Klenow, Fermentas) to produce acceptor ITR.
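
The sequence relationships described above lend themselves to a quick
computational check. The following Python sketch is an illustration and
not part of the original methods: it tests whether the two mutagenesis
primers are reverse complements of each other and shows how joining a
unit head-to-tail or tail-to-tail gives a direct repeat or a perfect
palindrome. The helper names are hypothetical, the recognition
sequences CCTAGG (XmaJI) and ACTAGT (BcuI) are stated here as
assumptions, and the simple string concatenation ignores the shared
CTAG overhang at the real cloning junction.

```python
# Illustrative sketch only; helper names are hypothetical and the
# concatenation model is a simplification of the actual cloning junctions.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def is_palindrome(seq: str) -> bool:
    """True if the sequence equals its own reverse complement."""
    return seq == reverse_complement(seq)


def direct_repeat(unit: str) -> str:
    """Head-to-tail duplication of a unit (parallel arrangement)."""
    return unit + unit


def inverted_repeat(unit: str) -> str:
    """Tail-to-tail duplication of a unit (antiparallel, nonspaced)."""
    return unit + reverse_complement(unit)


# Mutagenesis primers as given in the text.
SENSE = "GAAGACCTAGGTGGAGGAC"
ANTISENSE = "GTCCTCCACCTAGGTCTTC"

# Assumed recognition sequences (not stated explicitly in the text).
XMAJI_SITE = "CCTAGG"
BCUI_SITE = "ACTAGT"

# Top-strand oligodeoxyribonucleotide of the DR/IR.1 pair, used as the unit.
UNIT = "CTAGGAGCGAGCGAGCGAGCGAGCGAGCGCCGAGCCCCAACTAGT"

if __name__ == "__main__":
    print("primers complementary:", reverse_complement(SENSE) == ANTISENSE)
    print("XmaJI site in sense primer:", XMAJI_SITE in SENSE)
    print("BcuI site in unit:", BCUI_SITE in UNIT)
    print("direct repeat is palindrome:", is_palindrome(direct_repeat(UNIT)))
    print("inverted repeat is palindrome:", is_palindrome(inverted_repeat(UNIT)))
```

Only the inverted arrangement is self-complementary, which is the
property that allows intrastrand base pairing in the first place.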
Figure 1. Episomal recombination system to study the effects of
repetitive DNA sequences on homology-directed gene repair in mammalian
cells. (A) The bipartite system consists of acceptor and donor plasmids
with differently disrupted GFP ORFs. In the acceptor plasmids, the
GFP ORF is interrupted by an amber stop codon (asterisk) plus a
test DNA sequence of choice (large red bar); whereas in the donor
plasmid, it is rendered nonfunctional by the deletion of its first 38
nts (short red bar). The interrupted GFP ORF in the various
acceptor plasmids is framed by a constitutively active (broken arrow)
human eukaryotic translation elongation factor 1 alpha 1 (EF1α) gene
promoter and the SV40 pA (SV.pA), whereas that in the donor plasmid
is preceded by the rabbit β-globin pA (βG.pA) and followed by the
SV40 pA. The prokaryotic origins of replication R6K and ColE1 as
well as the antibiotic resistance genes aminoglycoside
3'-phosphotransferase (KanR) and β-lactamase (AmpR) present in the
acceptor and donor plasmid backbones, respectively, are also indicated.
Once introduced into cells, donor and acceptor plasmids are candidate
substrates for HR by virtue of the shared 339- and 584-bp DNA
sequences. Reciprocal exchange of genetic information between
acceptor and donor templates via cross-overs within these homologous
regions is expected to give rise to transcription units with restored
ORFs directing the synthesis of full-length GFP. (B) Three-step
strategy to generate the various acceptor plasmids with test sequences
1 through 4 arranged in a direct or inverted repeat orientation (DR and
IR series, respectively). Step 1: PCR-based site-directed mutagenesis of
the C residue at position 624 of the GFP ORF into a G generates a
premature stop codon and a XmaJI recognition site. Step 2: Insertion,
at the XmaJI site, of test sequences 1, 2, 3 or 4, which all contain a
BcuI recognition sequence at their 3'-end. Step 3: Molecular clones with
the BcuI sites in a position distal to the premature stop codon
(orientation depicted) were used to duplicate test sequences 1, 2, 3 or 4.
Acceptor plasmids with the meganuclease I-SceI recognition site or
containing a single copy of target sequence 1 served as controls. (C)
Schematic representation of direct and inverted repeats of DNA
sequences 1 through 4, the I-SceI recognition site and a single copy of
test sequence 1. The propensity of each DNA sequence to transit from
lineform to cruciform by intrastrand Watson and Crick base pairing
was estimated by calculating the Gibbs free energy (ΔG) in the
presence of 150 mM NaCl or of 1 M NaCl.

Plasmid pUC.hrGFPI.SV40pA was prepared by
introducing the GFP ORF and the downstream bidirectional
simian virus 40 (SV40) polyadenylation signal (pA)
derived from pA1.GFP.A2 into pUC19 using BamHI
and XbaI. Next, the pA of the rabbit β-globin (βG) gene
was inserted immediately upstream of the GFP-coding
sequence in pUC.hrGFPI.SV40pA to inhibit possible
spurious transcription of the GFP ORF due to the
presence of cryptic promoters in the plasmid backbone.
To this end, pAAV.hEF1α.DsRedT4.rbGpA, an AAV
vector shuttle construct containing a DsRed.T4 (14)
expression unit controlled by the human eukaryotic
translation elongation factor 1 alpha 1 (EF1α) gene promoter
and βG pA, was incubated with NotI and SmiI, the digestion
products were blunt-ended using Klenow and the 587-bp
βG pA-containing DNA fragment was purified from
agarose gel. Subsequently, this fragment was inserted in
the proper orientation into pUC.hrGFPI.SV40pA
between the Klenow-blunted XbaI and HindIII sites to
produce pUC.donor.rbGpA.hrGFPI.SV40pA. Deletion
of the GFP start codon from
pUC.donor.rbGpA.hrGFPI.SV40pA was achieved by digesting
the plasmid with SalI and SdaI, filling in the 3' recessed
ends with Klenow and self-ligation of the plasmid backbone
to create the donor template pUC.donor.GFP.ΔATG
(i.e. GFP ΔATG; GenBank accession number: JF714898).
Construct pUC.donor.rbGpA.DsRed.T4.SV40pA was
made by replacing the GFP ORF in
pUC.donor.rbGpA.hrGFPI.SV40pA by the coding sequence of
the red fluorescent protein (RFP) DsRed.T4. The
DsRed.T4-coding sequence was excised from
pAAV.hEF1α.DsRedT4.rbGpA using XbaI and NotI and
subsequently combined with the 3.5-kb XbaI×NotI fragment of
pUC.donor.rbGpA.hrGFPI.SV40pA. Disruption of the
RFP ORF in pUC.donor.rbGpA.DsRed.T4.SV40pA
was accomplished by linearization with NcoI followed
by Klenow treatment and self-ligation resulting in
nonhomologous donor plasmid pUC.donor.RFP.ΔATG
(i.e. RFP ΔATG; GenBank accession number: JF714899).
The I-SceI expression construct pCAG.I-SceI (GenBank
accession number: JF714900) was made by inserting the
3.1-kb SalI×PstI fragment of pCbASce (15) into the
polylinker of pUC19 after its digestion with SalI and
PstI. The expression plasmid pCAG.I-SceI(Δ112-246)
encoding a nonfunctional version of I-SceI was generated
by digesting pCAG.I-SceI with BstBI followed by
self-ligation of the resulting vector backbone. All DNA
preparations were made by using the JetStar 2.0 DNA
isolation system (Genomed).

To provide acceptor ScR, acceptor DR.1, acceptor IR.1 and
acceptor spIR.1 with an SV40 origin of replication (ori),
they were linearized with NdeI and blunt-ended using
Klenow. Next, these linear DNA molecules were ligated
to the SV40 ori-containing 323-bp PvuII×Eco147I
fragment of pGL4.22 (GenBank accession number:
DQ188842), yielding acceptor ScR ORI, acceptor DR.1 ORI,
acceptor IR.1 ORI and acceptor spIR.1 ORI. All
oligodeoxyribonucleotides were supplied by Eurofins MWG
Operon, while the restriction and DNA modifying enzymes
were from Fermentas.

Gibbs free energy calculations 

The Gibbs free energy of the most stable secondary
structure that can be folded by each of the DNA segments
inserted into the XmaJI site of pR6K.GFP.STOP was
calculated with the aid of the software program Mfold
3.2 (16) using energy rules for DNA (17) at
http://mfold.rna.albany.edu/?q=mfold/DNA-Folding-Form.
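
To give a feel for the magnitude of such folding free energies, the
sketch below converts a ΔG value into a two-state equilibrium constant,
K = exp(−ΔG/RT). This is a textbook relation and not a calculation
reported in this work; it assumes a simple folded/unfolded equilibrium
of a single strand at 37°C and ignores supercoiling and duplex context,
both of which matter for cruciform extrusion. The function names are
hypothetical.

```python
import math

# Gas constant in kcal/(mol*K).
R_KCAL_PER_MOL_K = 1.987204e-3


def folding_equilibrium_constant(delta_g_kcal_per_mol: float,
                                 temperature_celsius: float = 37.0) -> float:
    """Two-state estimate K = exp(-dG / RT); an assumption, not the paper's method."""
    temperature_kelvin = temperature_celsius + 273.15
    return math.exp(-delta_g_kcal_per_mol / (R_KCAL_PER_MOL_K * temperature_kelvin))


def folded_fraction(delta_g_kcal_per_mol: float,
                    temperature_celsius: float = 37.0) -> float:
    """Fraction of molecules in the folded state under the two-state assumption."""
    k = folding_equilibrium_constant(delta_g_kcal_per_mol, temperature_celsius)
    return k / (1.0 + k)


if __name__ == "__main__":
    # -0.35 kcal/mol is the value quoted in the Results for the ScR test sequence.
    for dg in (-0.35, -5.0, -20.0):
        print(dg, folding_equilibrium_constant(dg), folded_fraction(dg))
```

Under these assumptions, a ΔG close to zero corresponds to only a
marginal preference for the folded state, whereas strongly negative
values favor folding by many orders of magnitude.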

Extrachromosomal DNA extraction 

Extrachromosomal DNA was extracted from the
transfected cells essentially as described before (18).
Briefly, at 72 hours post-transfection, cells were scraped
from the surface of a 2-cm² well with the plunger of a 1-ml
Luer-Lok disposable syringe (BD Biosciences). The cell
suspension was collected in a 15-ml screwcap tube with
a conical bottom (Greiner Bio-One) and centrifuged for
5 min at 1500 g. The supernatant was aspirated and the
cells were washed once with 5 ml phosphate-buffered
saline. After another round of centrifugation, the cell
pellet was resuspended in 180 µl of solution I [10 mM
Tris-HCl at pH 8.0; 10 mM EDTA; 100 µg/ml proteinase
K (Fermentas)] and transferred to a 1.5-ml microtube
(Eppendorf). Next, 180 µl of solution II (10 mM
Tris-HCl at pH 8.0; 10 mM EDTA; 1.2% sodium dodecyl
sulfate) was added. The microtube was inverted thrice to
mix its content and incubated for 30 min at 37°C. Next,
the sample was mixed with 90 µl of 5 M NaCl and stored
overnight at 4°C. The following day, the chromosomal
DNA was pelleted by centrifugation for 60 min at
16 100 g and the supernatant was transferred to a new
microtube. Subsequently, the supernatant was extracted
twice with buffer-saturated phenol:chloroform:isoamyl
alcohol (25:24:1) and once with chloroform and the
episomal DNA was precipitated by the addition of 2.5
volumes of ethanol and 0.5 volumes of 7.5 M
ammonium acetate at pH 5.5. After washing with 70%
ethanol, the DNA pellet was dried and dissolved in
100 µl of TE+ buffer [10 mM Tris-HCl at pH 8.0; 1 mM
EDTA; 100 µg/ml RNase A (Fermentas)]. The purified
DNA was used for PCR and Southern blot analyses.
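
The working concentrations reached during the lysis and salt
precipitation steps follow from simple dilution arithmetic. The sketch
below, which is illustrative and not part of the published protocol,
computes them from the volumes given above. It assumes that volumes are
additive and that the volume of the cell pellet is negligible; the
function and variable names are hypothetical.

```python
def final_concentration(stock_concentration: float,
                        stock_volume_ul: float,
                        total_volume_ul: float) -> float:
    """Concentration after dilution, in the same unit as the stock."""
    return stock_concentration * stock_volume_ul / total_volume_ul


# Volumes taken from the protocol text (microliters).
SOLUTION_I_UL = 180.0   # contains 100 ug/ml proteinase K
SOLUTION_II_UL = 180.0  # contains 1.2% SDS
NACL_STOCK_UL = 90.0    # 5 M NaCl

# Assumption: volumes are additive and the cell pellet adds no volume.
lysis_volume_ul = SOLUTION_I_UL + SOLUTION_II_UL
total_volume_ul = lysis_volume_ul + NACL_STOCK_UL

sds_during_lysis_percent = final_concentration(1.2, SOLUTION_II_UL, lysis_volume_ul)
proteinase_k_during_lysis = final_concentration(100.0, SOLUTION_I_UL, lysis_volume_ul)
nacl_final_molar = final_concentration(5.0, NACL_STOCK_UL, total_volume_ul)

if __name__ == "__main__":
    print("SDS during lysis (%):", sds_during_lysis_percent)
    print("Proteinase K during lysis (ug/ml):", proteinase_k_during_lysis)
    print("NaCl after salt addition (M):", nacl_final_molar)
```

With these volumes, the salt step brings the lysate to roughly 1 M
NaCl, the condition under which chromosomal DNA is pelleted while
episomal DNA remains in the supernatant.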

In vivo assay to detect processing of inverted DNA 
repeats 

A total of 80 000 HeLa cells were transfected essentially as
described under "DNA transfections" (Supplementary
'Materials and Methods' section) with 266 ng of
acceptor DR.1, acceptor IR.1 or acceptor ScR each mixed
with 133 ng of pCAG.I-SceI or of
pCAG.I-SceI(Δ112-246). Extrachromosomal DNA was isolated
72 hours post-transfection as described elsewhere in this
section and PCR was performed on 4 µl of DNA using
0.4 µM of primers 1 (5'-ATGGTGAGCAAGCAGATCCTGAAG-3')
and 2 (5'-CCGAGAAGGAAGTGCTCC-3'), 0.4 mM of each dNTP
(New England Biolabs), 1× GoTaq reaction buffer, 1 mM
MgCl2 and 2.5 U of GoTaq DNA polymerase (all from
Promega) in a final volume of 50 µl. The PCR cycles were
performed in a DNA Engine Tetrad 2 Peltier Thermal
Cycler (Bio-Rad) using the following cycling conditions:
a first denaturation step at 95°C for 5 min, followed by
30 cycles of 60 s at 95°C, 60 s at 64°C and 120 s at 72°C.
Reactions were terminated by a final extension period of
5 min at 72°C. The synthesized DNA was purified using
SureClean (Bioline) and dissolved in 30 µl of 10 mM
Tris-Cl, pH 8.0. The resulting PCR products were treated
with DpnI alone or with DpnI plus I-SceI, XmaJI or BcuI
and, subsequently, subjected to agarose gel
electrophoresis. The inclusion of DpnI served to remove
possible residual input prokaryotic plasmid DNA before
Southern blot analysis. The procedures for Southern
blotting and for DNA probe radiolabeling are described in
the Supplementary 'Materials and Methods' section. The
probe used is complementary to the GFP ORF and
contiguous SV40 pA sequences. Undigested PCR products
were cloned in the pCR4-TOPO cloning vector (Invitrogen)
using GT115 chemically competent E. coli cells.
Individual molecular clones corresponding to independent
DNA processing events were sequenced using primers 1 or
6 (5'-CAGCTTCGAGGTGGTG-3').
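
For planning purposes, the cycling conditions above can be written
down as a small data structure from which the total programmed hold
time follows directly. The Python sketch below is illustrative only:
the step labels "annealing" and "extension" are the conventional
reading of the 64°C and 72°C steps rather than terms used in the text,
the class and function names are hypothetical, and ramping time
between temperatures is not included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One hold step of a thermal cycling program."""
    name: str
    temperature_c: float
    seconds: int


# Values taken from the cycling conditions in the text.
INITIAL_DENATURATION = Step("initial denaturation", 95.0, 5 * 60)
CYCLE = (
    Step("denaturation", 95.0, 60),
    Step("annealing", 64.0, 60),    # label is an assumption
    Step("extension", 72.0, 120),   # label is an assumption
)
N_CYCLES = 30
FINAL_EXTENSION = Step("final extension", 72.0, 5 * 60)


def total_hold_seconds() -> int:
    """Sum of all programmed hold times, excluding temperature ramps."""
    per_cycle = sum(step.seconds for step in CYCLE)
    return INITIAL_DENATURATION.seconds + N_CYCLES * per_cycle + FINAL_EXTENSION.seconds


if __name__ == "__main__":
    seconds = total_hold_seconds()
    print(f"Programmed hold time: {seconds} s ({seconds / 60:.0f} min)")
```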

Statistical analysis 

Statistical parameters were computed using GraphPad
Prism 4.03. Student's t-test was applied to compare data
sets with P < 0.05 being considered significant.
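
As a worked illustration of this type of comparison, the sketch below
computes an unpaired, equal-variance Student's t statistic from summary
values. It uses the GFP-positive cell frequencies reported in the
Results for donor + ScR without and with I-SceI (0.3 ± 0.2%, n = 8, and
2.5 ± 0.4%, n = 11). Two points are assumptions rather than statements
from the text: that the ± values are standard deviations, and that
these two groups were compared in this way. The function name is
hypothetical, and the calculation is not a reproduction of the Prism
analysis.

```python
import math


def pooled_t_statistic(mean_a: float, sd_a: float, n_a: int,
                       mean_b: float, sd_b: float, n_b: int):
    """Unpaired Student's t statistic (equal variances) from summary data.

    Returns the t statistic and the degrees of freedom.
    """
    degrees_of_freedom = n_a + n_b - 2
    pooled_variance = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / degrees_of_freedom
    standard_error = math.sqrt(pooled_variance * (1.0 / n_a + 1.0 / n_b))
    return (mean_a - mean_b) / standard_error, degrees_of_freedom


# Two-sided critical value for alpha = 0.05 at 17 degrees of freedom,
# taken from standard t tables.
T_CRITICAL_DF17 = 2.110

if __name__ == "__main__":
    # Assumption: the reported +/- values are standard deviations.
    t_value, df = pooled_t_statistic(2.5, 0.4, 11, 0.3, 0.2, 8)
    print("t =", round(t_value, 2), "df =", df)
    print("significant at P < 0.05:", abs(t_value) > T_CRITICAL_DF17)
```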

RESULTS 

Design of the episomal HR assay system 

We established an extrachromosomal assay system to 
study the role of DNA repeats in homology-directed 
gene repair in vivo. This system is based on pairs of 
recombination substrates consisting of donor and
acceptor (or target) DNA molecules containing GFP 
reporter genes that are rendered defective by different 
means (Figure 1A). The transcription unit in the donor 
plasmid was made nonfunctional by removing the first 38 
nucleotides from the GFP ORF and lacks eukaryotic 
promoter/enhancer elements, whereas the GFP ORF in 
the various acceptor constructs was disrupted by an 
internal stop codon preceding the test DNA sequences 
depicted in Figure 1B and C. The donor and acceptor
DNA templates share two regions of perfect sequence 
identity (Figure 1A, marked in gray), which in the 
acceptor plasmids are separated from each other by the 
test sequences. HR-dependent reciprocal exchange of 
genetic information through the common DNA 
segments of 584 and 339 bp or via a single cross-over 
within the upstream, 584-bp, arm of homology are both 
expected to generate fully functional GFP transcription 
units. Thus, the capacity of different test DNA sequences 
to elicit homology-directed gene repair can be readily 
assessed by the analysis of GFP expression through 
direct fluorescence microscopy or flow cytometry. 

The generation of site-specific DSBs by the 
Saccharomyces cerevisiae mitochondrial group I 
intron-encoded homing endonuclease I-Scel at its 
cognate 18 bp recognition sequence is a well-established 
method to trigger in a predictable manner DNA re- 
pair pathways in prokaryotic and eukaryotic systems 
(19). Therefore, in our experimental system, we used as 
positive control for HR-dependent rescue of GFP expres- 
sion, cells exposed to a donor plasmid, an acceptor 
template containing the ScR and an I-Scel expression con- 
struct. Cells that only received the first two plasmids 
served as negative control in the HR assay. 

Experimental evidence for inverted repeat-dependent 
formation of DNA secondary structures in acceptor 
plasmids 

The bacteriophage T7 endonuclease I is a commonly used 
tool to probe for the presence of cruciform- or 
Holliday-like secondary structures in DNA. This resolvase 
recognizes preferentially the four-way junction character- 
istic of these types of DNA conformers and, in its dimeric 
form, introduces paired nicks close to the branch points 
leading to subsequent DNA cleavage. On the other hand, 
negative superhelical torsional stress, such as present in 
the supercoiled (SC) fraction of plasmid DNA, is a 
major driving force in the nucleation and extrusion of 
cruciform-like structures at DNA repeats. Thus, the 
E. coli enzyme DNA topoisomerase I by catalyzing the 
relaxation of negatively SC DNA should, to a great 
extent, inhibit the generation of cruciforms as scored in 
in vitro assays based on the accumulation of T7 endo- 
nuclease I-resolved plasmid DNA molecules (i.e. linear 
form). The results presented in Figure 2A correspond to 
an experiment in which the validated cruciform-forming 
plasmid pUC(AT) was used to confirm that the 
time-dependent accumulation of the linear T7 endonucle- 
ase I-derived DNA product does indeed depend on SC 
DNA. In fact, in the presence of the DNA topoisomerase 



I, virtually all the pUC(AT) plasmid transited from the SC 
to the open circular (OC) form resulting in a concomitant 
stringent inhibition of cruciform resolution. Importantly, 
an equivalent experimental outcome was observed when 
the DNA substrate harboring the test IR.1 sequence was
deployed, suggesting that this plasmid is prone to the ac- 
quisition of cruciform-like structures as well (Figure 2B). 

Next, to investigate the ability of different tandem
arrangements of the test DNA sequences to induce the
nucleation and extrusion of cruciform-like structures in
plasmid substrates, the ScR-, IR.1- or DR.1-containing
acceptor constructs were treated or mock-treated with
T7 endonuclease I. For the sake of clarity, hereinafter
these acceptor plasmids will be named after the test
sequences that they contain (e.g. acceptor ScR, acceptor IR.1,
etc). Because acceptor ScR purposely possesses a test DNA
sequence with an intrinsically low intrastrand folding
capacity (ΔG = −0.35 kcal/mol), it served to establish
the background of the assay. To provide for a test DNA
sequence known to give rise to four-way DNA structures
under physiological conditions, we generated acceptor ITR,
which has a 127-bp DNA segment derived from the AAV
ITR inserted at GFP nucleotide position 624. This DNA
segment, which contains three self-complementary regions
(i.e. A/A', B/B' and C/C'), has a high propensity to fold
into a T-shaped hairpin structure by intrastrand
hybridization (20,21). Thus, the acceptor ITR plasmid was
also subjected to T7 endonuclease I treatment. Furthermore,
to generate internal references, each acceptor plasmid was
linearized in parallel with the restriction enzyme ApaLI.

Agarose gel electrophoresis of T7 endonuclease I- or
ApaLI-treated acceptor plasmids revealed that the
fragments with sizes consistent with DNA cleavage (i.e.
linearized DNA forms) were clearly more prominent in
the samples of acceptor IR.1 and acceptor ITR than in
those corresponding to acceptor ScR and acceptor DR.1
(Figure 2C). Indeed, there was no noticeable difference
in the accumulation of linear DNA between the
non-repeat-containing acceptor ScR and the direct
repeat-containing acceptor DR.1. As expected, agarose gel
electrophoresis of acceptor plasmids not exposed to T7
endonuclease I resulted in the detection of only SC and
OC (i.e. nicked) DNA topologies (Figure 2C).

Figure 2. In vitro assay to probe for the formation of cruciform-like
secondary structures in DNA repeat-containing acceptor plasmids.
(A) Testing the effect of the negative supercoiling-relaxing enzyme
DNA topoisomerase I on the yields of T7 endonuclease I-resolved DNA
(solid arrowhead) using the validated cruciform-forming plasmid
pUC(AT). pUC(AT) treated or not treated with DNA topoisomerase I
(+Topo and -Topo, respectively) was exposed to T7 endonuclease I for
10, 20, 30 or 60 min or underwent a 60-min mock treatment (0).
Following agarose gel electrophoresis, the resulting DNA forms were
visualized through ethidium bromide staining. OC and SC, open circular
and supercoiled DNA forms, respectively. (B) Testing the impact of the
negative supercoiling-relaxing enzyme DNA topoisomerase I on the
yields of T7 endonuclease I-resolved DNA (solid arrowhead) using the
acceptor substrate containing the test nonspaced inverted repeat
sequence IR.1. Acceptor IR.1 treated or not treated with DNA
topoisomerase I (+Topo and -Topo, respectively) was exposed to T7
endonuclease I for 10, 20 or 30 min or was subjected to a 30-min mock
treatment (0). OC and SC, open circular and supercoiled DNA forms,
respectively. (C) Target DNA plasmids acceptor ScR (ScR),
acceptor IR.1 (IR.1), acceptor DR.1 (DR.1) and acceptor ITR (ITR) were
incubated in the presence (+) or absence (-) of T7 endonuclease I
(T7) for 10 min. The resulting DNA products were resolved by agarose
gel electrophoresis and stained with ethidium bromide. OC and SC, open
circular and supercoiled DNA forms, respectively. L, Acceptor plasmid
linearized with ApaLI. (D) Target DNA plasmids acceptor IR.1,
acceptor DR.1 and acceptor ITR were either incubated with T7
endonuclease I for 10, 20, 30 or 60 min or underwent a 60-min mock
treatment (0). The resulting DNA products were analyzed by agarose gel
electrophoresis and ethidium bromide staining. L, Acceptor plasmid
linearized with ApaLI. (E) The test plasmids harboring the nonspaced
inverted repeat sequences IR.1 through IR.4 (left panels) as well as
those containing their respective direct repeat counterparts (right
panels) were treated with T7 endonuclease I for 20, 30 or 60 min and
then resolved through agarose gel electrophoresis. SC, supercoiled
DNA; solid arrowheads point to the resolved linear molecules. (F) The
plasmids acceptor DR.1, acceptor spIR.1, acceptor IR.1 and
acceptor ITR were incubated with T7 endonuclease I for 10, 20, 30 and
60 min and analyzed by agarose gel electrophoresis. (G) In vitro
mapping of the T7 endonuclease I cleavage site in acceptor molecules.
Upper panel, diagram of the expected digestion patterns resulting from
the combined activities of SalI and T7 endonuclease I or of HincII and
T7 endonuclease I. The numerals correspond to the sizes (in bp) of the
different DNA fragments, each of which is drawn in relation to the
parental acceptor DNA template containing the ITR sequence embedded
within the GFP ORF. Lower left and right panels, agarose gel
electrophoresis of ITR-containing acceptor molecules treated only with
SalI or with SalI and T7 endonuclease I or exposed to HincII or to
HincII plus T7 endonuclease I, respectively. Lanes M in all the
panels, GeneRuler DNA Ladder Mix molecular weight marker (Fermentas).

In another set of experiments, we investigated the T7
endonuclease I-dependent accumulation of linear acceptor
DNA species as a function of time. To this end, constructs
acceptor IR.1, acceptor DR.1 and acceptor ITR were incubated
with the resolvase for 10, 20, 30 or 60 min or left
untreated. Analysis of the resulting DNA products by
agarose gel electrophoresis showed that for each period of
T7 endonuclease I treatment, the inverted repeat-containing
plasmids acceptor IR.1 and acceptor ITR yielded higher
amounts of linear DNA than the direct repeat-containing
construct acceptor DR.1 (Figure 2D). In fact, exposure of
acceptor DR.1 for 60 min to T7 endonuclease I generated
fewer linear DNA molecules than a 10-min incubation of
acceptor IR.1 or acceptor ITR with the same enzyme (Figure
2D). However, linear acceptor ITR molecules accumulated
with faster kinetics than linear acceptor IR.1 DNA (Figure
2D). These data, in agreement with the calculated folding
free energy values (Figure 1C), indicate that T7
endonuclease I-susceptible secondary DNA structures
(e.g. DNA cruciforms) can be formed after insertion of 
the test DNA sequences as inverted repeats in acceptor 
plasmid substrates (Figure 1C). These experimental 
results further suggest that the AAV ITR has a higher 
propensity to acquire a T7 endonuclease I-sensitive 
DNA conformation than IR.1. Similar experiments were
carried out not only with the acceptor substrates
harboring IR.1, DR.1 and ITR but also with those containing
each of the other six nonspaced repeat sequences (i.e.
DR.2, DR.3, DR.4, IR.2, IR.3 and IR.4) as well as that
with the spaced inverted repeat sequence spIR.1. Results
from these experiments established that, for each of the
IR/DR acceptor pairs, the substrates containing the
nonspaced inverted repeat led, at every time point
analyzed, to a higher accumulation of T7 endonuclease
I-resolved linear forms than those harboring the
nonspaced direct repeat (Figure 2E, compare
each left panel with the corresponding right panel). Of
note, acceptor spIR.1 with its spaced inverted repeat was
clearly less prone to T7 endonuclease I digestion than its
acceptor IR.1 counterpart (Figure 2F, compare second with
third panel from the top). Once again, acceptor ITR with its
multi-palindromic AAV ITR sequence displayed a
somewhat higher susceptibility to the T7 endonuclease I
when compared with that of acceptor IR.1 (Figure 2F,
compare the two lowest panels). Finally, to identify the
position corresponding to the major T7 endonuclease I
cleavage site in acceptor DNA backbones with secondary
structure-forming test sequences, we incubated an
ITR-containing construct (Figure 2G, upper panel)
exclusively with SalI or with SalI and T7 endonuclease I or
exposed it to HincII or HincII and T7 endonuclease I.
Agarose gel electrophoresis of the resulting DNA
fragments revealed digestion patterns fully consistent with
T7 endonuclease I-dependent cleavage of the acceptor
templates at the location of the inverted repeat (Figure
2G, lower panels).

Inverted repeats stimulate DNA exchange through HR 
in mammalian cells 

Next, we deployed the aforementioned episomal HR 
assay system (Figure 1) to ask whether DNA templates 
containing direct or inverted DNA repeats can serve as 
substrates for homology-directed gene repair in mammalian
cells. In these experiments, HeLa cells were either
mock-transfected or transfected with different plasmid
combinations. At 4 days post-transfection, HR-dependent
GFP repair was measured by flow cytometry. Transfection
of the donor construct GFP ΔATG, acceptor DR.1 or
acceptor IR.1 alone did not give rise to measurable
GFP-specific signals, showing that the mutations
introduced in these plasmids did functionally disrupt the
GFP ORF (Figure 3A). Cotransfection of both GFP ΔATG
and acceptor ScR resulted in a low percentage of
GFP-positive cells [i.e. 0.3 ± 0.2% (n = 8); Figure 3A,
donor + ScR], establishing the background of the assay
in the presence of both donor and acceptor templates.
To validate the extrachromosomal recombination
system, we relied on I-SceI-mediated site-specific DSB
formation, which is a well-established method to induce
HR in a controlled and predictable manner [see e.g. ref.
(19)]. For this purpose, HeLa cells were cotransfected
with GFP ΔATG, acceptor ScR and the I-SceI-encoding
expression plasmid pCAG.I-SceI. The inclusion of
pCAG.I-SceI resulted in a large increase in the frequency
of GFP-positive cells [i.e. 2.5 ± 0.4% (n = 11); Figure
3A, donor + ScR + I-SceI]. Interestingly, while
cotransfection of GFP ΔATG and acceptor DR.1 yielded
background levels of GFP-positive cells (Figure 3A,
compare donor + ScR with donor + DR.1), codelivery of
the donor construct together with acceptor IR.1 gave rise to
a significantly higher percentage of GFP-positive cells [i.e.
1.9 ± 0.5% (n = 11); Figure 3A, compare donor + ScR
and donor + DR.1 with donor + IR.1]. These data imply
that a tandem of test sequence 1, when arranged in an
inverted repeat orientation, serves as an effective target
for homology-directed gene repair.
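
The frequencies quoted above are percentages of GFP-positive events among the viable cells counted per sample, summarized as mean ± standard deviation over independent experiments. A minimal Python sketch of that arithmetic follows. The function names are illustrative, and the use of the sample standard deviation is an assumption, since the text does not state which estimator was applied. Only the three reported means and the 10 000-event sample size (legend of Figure 3) come from the source.

```python
from statistics import mean, stdev


def percent_positive(gfp_positive_events, viable_events=10_000):
    """Percentage of GFP-positive cells among the viable events counted."""
    if viable_events <= 0:
        raise ValueError("viable_events must be positive")
    return 100.0 * gfp_positive_events / viable_events


def summarize(percentages):
    """Return (mean, sample SD, n) for a list of replicate percentages.

    Assumption: sample standard deviation (n - 1 denominator).
    """
    if len(percentages) < 2:
        raise ValueError("at least two replicates are needed for an SD")
    return mean(percentages), stdev(percentages), len(percentages)


def fold_over_background(sample_mean, background_mean):
    """Ratio of a condition's mean frequency to the background mean."""
    return sample_mean / background_mean


# Mean percentages of GFP-positive HeLa cells as reported in the text.
REPORTED_MEANS = {
    "donor + ScR": 0.3,           # background of the assay
    "donor + ScR + I-SceI": 2.5,  # DSB-induced positive control
    "donor + IR.1": 1.9,          # nonspaced inverted repeat
}

if __name__ == "__main__":
    background = REPORTED_MEANS["donor + ScR"]
    for condition, value in REPORTED_MEANS.items():
        fold = fold_over_background(value, background)
        print(f"{condition}: {value:.1f}% ({fold:.1f}-fold over background)")
```

With the reported means, the inverted repeat IR.1 corresponds to roughly a 6-fold increase over the donor + ScR background, and the I-SceI control to roughly an 8-fold increase.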

To further investigate the relationship between the 
relative orientation of repetitive DNA sequences and 
HR-mediated gene repair, HeLa cells were cotransfected
with GFP ΔATG and with an acceptor plasmid containing a
tandem of test sequences 2, 3 or 4 arranged in either a
direct or an inverted repeat configuration. Consistent
with the previous results (Figure 3A), acceptor plasmids
endowed with the various inverted repeats yielded significantly
higher numbers of GFP-positive cells than their
isogenic direct repeat-containing counterparts, which
gave rise to frequencies of GFP-positive cells not significantly
above background level (i.e. GFP ΔATG +
acceptor ScR; Figure 3B). Thus, at least for the four different
test DNA sequences that were investigated, which
possess the same GC content, induction of HR is primarily
dependent on the arrangement of the repetitive DNA
unit rather than on its specific nucleotide sequence.

Central spacing abolishes inverted DNA repeat-dependent 
homology-directed gene repair 

To study the impact of repetitive DNA spacing on inverted
repeat-induced HR, we generated acceptor spIR.1. This
plasmid was made by inserting, at the axis of symmetry
of IR.1, a 42-bp sequence encompassing the ScR to effectively
separate test sequence 1 from its reverse complement
copy. In these experiments, HeLa cells were transfected
with donor plasmid GFP ΔATG mixed with acceptor ScR,
acceptor IR.1 or acceptor spIR.1 and, where indicated,
pCAG.I-SceI. As previously observed, cotransfection of
GFP ΔATG and acceptor IR.1 resulted in significant
HR-dependent GFP expression (Figure 3C). However,
substitution of acceptor IR.1 by acceptor spIR.1 yielded
GFP expression rescue activity levels that were not
above those detected in cell cultures exposed to
GFP ΔATG and acceptor ScR (Figure 3C). Possibly,
physical separation of the repetitive DNA unit, as in
acceptor spIR.1, inhibits the in vivo formation of
cruciform-like structures, as suggested by our in vitro
assay results (Figure 2F) as well as those of others
(22-24), so that the transfected DNA templates are
no longer a target for the HR machinery. Addition of
pCAG.I-SceI to the transfection mixtures consisting of
Figure 3. Effect of repetitive DNA sequences in a direct or inverted
repeat configuration on homology-directed gene repair in HeLa cells.
(A) HeLa cells were transfected with acceptor DR.1 or acceptor IR.1 alone
(DR.1 and IR.1, respectively) or with either of these two acceptor
plasmids in combination with the donor construct GFP ΔATG
(donor + DR.1 and donor + IR.1, respectively). Mock-transfected
HeLa cells (mock) and HeLa cells transfected with GFP ΔATG alone
(donor) or together with acceptor ScR (donor + ScR) served as negative
controls. The positive control for the rescue of GFP expression by HR
was provided by cotransfecting HeLa cells with acceptor ScR, GFP ΔATG
and the I-SceI-encoding plasmid pCAG.I-SceI (donor + ScR + I-SceI).



GFP ΔATG and either acceptor ScR or acceptor spIR.1 rescued
high-level GFP expression (Figure 3C). The latter outcome
shows that acceptor spIR.1 does not contain an intrinsically
defective GFP target template. Taken together, these data
indicate that perfect palindromes or nonspaced inverted
DNA repeats are preferred over spaced inverted DNA
repeats as targets for homology-directed gene repair
in vivo, presumably due to their capacity to form secondary
structures that can subsequently serve as direct
targets for cellular structure-specific nucleases.
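
The three repeat arrangements compared in this section differ only in how a test sequence and its copy are joined. The Python sketch below illustrates the distinction under stated assumptions: the 11-nt unit and the 5-nt spacer are hypothetical stand-ins (they are not test sequence 1 or the 42-bp ScR-containing spacer), and the helper names are ours. A nonspaced inverted repeat reads identically on both strands, which is the sequence property that allows each strand to fold back on itself.

```python
# Translation table for complementing an uppercase DNA string.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def direct_repeat(unit):
    """Two head-to-tail (parallel) copies of the unit."""
    return unit + unit


def inverted_repeat(unit, spacer=""):
    """The unit followed by its reverse complement (antiparallel copy).

    An empty spacer gives a nonspaced inverted repeat; a non-empty
    spacer sits at the axis of symmetry and separates the two arms.
    """
    return unit + spacer + reverse_complement(unit)


def is_perfect_palindrome(seq):
    """True if the sequence equals its own reverse complement."""
    return seq == reverse_complement(seq)


# Hypothetical sequences, for illustration only.
UNIT = "GATTACAGGCT"
SPACER = "TTTTT"

if __name__ == "__main__":
    constructs = {
        "direct repeat": direct_repeat(UNIT),
        "nonspaced inverted repeat": inverted_repeat(UNIT),
        "spaced inverted repeat": inverted_repeat(UNIT, SPACER),
    }
    for name, seq in constructs.items():
        print(f"{name}: {seq} palindrome={is_perfect_palindrome(seq)}")
```

Only the nonspaced inverted repeat passes the palindrome check. The spaced construct keeps both complementary arms but must extrude the spacer as an unpaired loop, in line with the reduced T7 endonuclease I sensitivity reported for acceptor spIR.1.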

Experimental evidence for in vivo nuclease-mediated 
processing of nonspaced inverted DNA repeats 

In search of evidence for nuclease-mediated processing
of palindromic test sequences in vivo, we set up the
assay system illustrated in Figure 4A. In this assay,
HeLa cells are transfected with acceptor IR.1,
acceptor DR.1 or with acceptor ScR mixed with pCAG.I-SceI
or with pCAG.I-SceI(Δ112-246). Expression
plasmid pCAG.I-SceI(Δ112-246) encodes a nonfunctional
I-SceI protein. Cells cotransfected with acceptor ScR and
pCAG.I-SceI constitute a positive control for in vivo
site-specific DSB formation at acceptor templates. A key
feature of this assay is the fact that a discriminating
marker in the form of a BcuI recognition site lies at the
axis of symmetry of the test nonspaced inverted repeat
sequence (Figures 1B and 4A, upper panel). Generation
of cruciforms at this sequence followed by its recognition
and processing by cellular structure-specific nuclease(s)
should result in DNA breaks. The resulting DNA can
subsequently serve as a substrate for error-prone DNA
repair processes in the cell, such as nonhomologous
end-joining (NHEJ), eventually leading to the emergence
of a population of BcuI-resistant acceptor molecules
(Figure 4A). Similarly, processing of I-SceI-mediated
DSBs by a cellular error-prone DNA repair pathway
should yield templates that are knocked out in the I-SceI



Quantification of the number of GFP-positive cells was carried out
by flow cytometry at 4 days post-transfection. A minimum of 5 and
a maximum of 11 independent experiments were performed with 10 000
events corresponding to viable cells being measured per sample. (B)
HeLa cells were cotransfected with GFP ΔATG plus either acceptor IR.2,
acceptor DR.2, acceptor IR.3, acceptor DR.3, acceptor IR.4 or acceptor DR.4.
To facilitate comparison, data sets corresponding to HeLa cells
cotransfected with GFP ΔATG and acceptor ScR or with GFP ΔATG,
acceptor ScR and pCAG.I-SceI as well as those corresponding to HeLa
cells cotransfected with GFP ΔATG and either acceptor IR.1 or
acceptor DR.1 presented in Figure 3A are repeated in Figure 3B (open
and gray bars, respectively). Quantification of GFP expression rescue
was carried out by flow cytometry at 4 days post-transfection.
Cumulative data from 4 different experiments (solid bars) are expressed
as mean ± standard deviation. *P = 0.002, **P = 0.009. (C) Flow
cytometric analysis of HeLa cells that, in addition to being exposed
to the donor plasmid GFP ΔATG, also received acceptor ScR, acceptor IR.1
or acceptor spIR.1. In the latter construct, the inverted repeat of test
sequence 1 is interrupted at its axis of symmetry by an I-SceI recognition
site (see diagram below the graph). HeLa cells cotransfected with
GFP ΔATG, the I-SceI-encoding plasmid pCAG.I-SceI and either
acceptor ScR or acceptor spIR.1 served as positive controls for
HR-mediated GFP repair. Data corresponding to a minimum of
three different experiments are shown as mean ± standard
deviation.
Figure 4. In vivo processing and repair of nonspaced inverted DNA repeats. (A) Diagrammatic representation of the experimental set-up deployed to
study the processing and repair of nonspaced inverted DNA repeats in cells (see text for details and legend of Figure 1A for an explanation of the
symbols). (B) Southern blot analysis using a GFP ORF- and SV40 pA-specific probe of PCR products derived from extrachromosomal DNA
isolated from HeLa cells cotransfected with acceptor ScR and pCAG.I-SceI (+I-SceI), acceptor ScR and pCAG.I-SceI(Δ112-246) (-I-SceI), acceptor IR.1
or acceptor DR.1. Before electrophoresis, the DNA samples were treated with DpnI alone (-) or with DpnI together with I-SceI, XmaJI or BcuI. Sizes
corresponding to undigested amplicons (solid arrowhead) expected for the primer pair 1/2 are indicated on the right of the autoradiograms.
(C) Upper panel, Diagram of a pCR4-TOPO molecular clone containing a PCR product amplified from extrachromosomal DNA isolated from
HeLa cells transfected with acceptor IR.1 and treated with BcuI. Vertical arrow points to the original position of the BcuI recognition site. Lower
panel, Agarose gel electrophoresis of BcuI-treated pCR4-TOPO clones harboring PCR products amplified from episomal DNA extracted from HeLa
cells transfected with acceptor IR.1 and exposed to BcuI digestion (lanes 1 through 6). Lane M, GeneRuler DNA Ladder Mix molecular weight
marker.



cognate target site. Thus, DNA processing/repair at
specific test sequences should lead to a mixture of
GFP templates that can be PCR-amplified and
discriminated on the basis of sequence-specific enzymatic
digestions combined with Southern blot and nucleotide
sequence analysis. Southern blotting of amplicons made
with the aid of primer set 1/2 (Figure 4A) showed the
presence of I-SceI-undigested templates in extrachromosomal
DNA isolated from HeLa cells cotransfected with
acceptor ScR and pCAG.I-SceI (+I-SceI; Figure 4B) but
not in those cotransfected with acceptor ScR and
pCAG.I-SceI(Δ112-246) (-I-SceI; Figure 4B). Importantly,
the same analysis applied to episomal DNA
extracted from HeLa cells transfected with acceptor IR.1
or with acceptor DR.1 revealed the presence of
BcuI-undigested templates in cells exposed to the former
construct (Figure 4B, lower-left panel). The fact that PCR
products amplified from acceptor ScR or from acceptor IR.1
and treated with XmaJI did not yield discernible undigested
material suggests that the majority of the DNA
sequence modifications took place in the vicinity of the
respective I-SceI site or of the IR.1 axis of symmetry.
Figure 5. Effect of repetitive DNA sequences in a direct or inverted
repeat configuration on homology-directed gene repair in mammalian
cells. (A) Flow cytometric analysis of human embryonic kidney 293T
cells (HEK), human fetal retinoblasts (911 and PER.tTA.Cre76 [PER])
and African green monkey kidney fibroblasts (COS-7) cotransfected
with GFP ΔATG and either acceptor DR.1 or acceptor IR.1. Negative and



Next, amplicons isolated from the BcuI-undigested
fraction were cloned into pCR4-TOPO and subjected to
restriction fragment length analysis using BcuI. This
analysis confirmed the disruption of the BcuI restriction
site in several of the analyzed clones (Figure 4C, lanes 1, 2,
4, 5 and 6). Moreover, the apparently different molecular
weights of the restriction fragments migrating slower than
1.2-kb linear DNA suggest that the BcuI-refractory
clones do harbor sequences representing the end-products
of independent DNA processing/repair events. To confirm
this and to identify the breakpoints in acceptor IR.1 templates
at the nucleotide level, we carried out DNA
sequence analysis of clones 5 and 6 (Supplementary
Figure S1). As a control, we also sequenced a
pCR4-TOPO-based clone corresponding to PCR-amplified
DNA from HeLa cells cotransfected with acceptor ScR
and pCAG.I-SceI (Supplementary Figure S1).
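
The logic of the BcuI-resistance read-out can be expressed as a simple sequence check: a clone whose repair junction no longer contains the recognition site is refractory to digestion. The Python sketch below is illustrative only. It assumes the BcuI recognition sequence ACTAGT (BcuI being an isoschizomer of SpeI), and the two clone sequences are hypothetical, not the sequences reported in Supplementary Figure S1.

```python
# Translation table for complementing an uppercase DNA string.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Assumed BcuI (SpeI isoschizomer) recognition sequence.
BCUI_SITE = "ACTAGT"


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def has_site(seq, site=BCUI_SITE):
    """True if the site occurs on either strand of the sequence."""
    seq = seq.upper()
    return site in seq or site in reverse_complement(seq)


def classify_clone(seq, site=BCUI_SITE):
    """Label a cloned amplicon by its predicted digestion behaviour."""
    return "digestible" if has_site(seq, site) else "resistant"


# Hypothetical junction sequences, for illustration only.
CLONES = {
    "unmodified": "GGATCCACTAGTGGATCC",  # site intact at the symmetry axis
    "deletion": "GGATCCACGTGGATCC",      # two nucleotides lost within the site
}

if __name__ == "__main__":
    for name, seq in CLONES.items():
        print(f"{name}: {classify_clone(seq)}")
```

Because the assumed site is itself palindromic, checking one strand would suffice here; both strands are scanned so that the helper also works for non-palindromic sites.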



Inverted DNA repeat-dependent homology-directed 
gene repair occurs in a variety of mammalian cell types 

Subsequently, we exposed cultures of HEK 293T
cells, human fetal retinoblasts (911 and PER.tTA.Cre76
cells) and African green monkey kidney fibroblasts
(COS-7 cells) to GFP ΔATG and either acceptor DR.1 or
acceptor IR.1. Again, negative and positive controls were
provided by cotransfecting cultures of each of these cell
types with a mixture of the donor plasmid GFP ΔATG and
acceptor ScR alone or together with pCAG.I-SceI, respectively.
Data depicted in Figure 5A show distinct levels of
HR-dependent GFP repair in the various cell types tested.
This might be the result of different transfection
efficiencies and/or of intrinsic cell type-specific differences
in the ability to recognize and process, via HR, DNA
secondary structures. Importantly, however, as in
HeLa cells, appreciable HR-mediated GFP repair was
only observed after cotransfection of GFP ΔATG and
acceptor IR.1 (Figure 5A). This IR.1-mediated HR stimulatory
effect was independent of the amount of p53
present in the disparate cell types tested (Supplementary
Figure S2). Some representative flow cytometry dot plots
and direct fluorescence microscopy micrographs corresponding
to these experiments are depicted in Figure 5B
and C, respectively. Collectively, these experiments



positive controls for HR-mediated GFP repair in each of the tested cell
types were provided by cells containing GFP ΔATG and acceptor ScR (-)
or these two plasmids as well as pCAG.I-SceI (+), respectively. (B) Dot
plot representation of GFP expression in human embryonic kidney
293T cells (HEK) transfected with GFP ΔATG alone (donor) or with a
mixture of GFP ΔATG and either acceptor DR.1 or acceptor IR.1. Cultures
cotransfected with GFP ΔATG, acceptor ScR and the I-SceI expression
plasmid pCAG.I-SceI served as positive control. Flow cytometry was
carried out 3 days post-transfection with 10 000 viable cells being
analyzed per sample. (C) Live-cell imaging by phase-contrast and fluorescence
microscopy of monolayers of African green monkey kidney
fibroblasts (COS-7) cotransfected with GFP ΔATG and either
acceptor IR.1 or acceptor DR.1. Parallel cultures exposed to GFP ΔATG,
acceptor ScR and pCAG.I-SceI served as positive control for
HR-dependent GFP reconstitution. Microscopic analysis was performed
3 days post-transfection. Original magnification: ×40.
suggest that the inverted DNA repeat-induced HR is a
mammalian cell type-independent phenomenon. 

Composite inverted DNA repeats are equally effective 
at stimulating homology-directed gene repair as simple 
palindromes 

To investigate the capacity of a structurally more
'complex' or composite inverted DNA repeat to stimulate
HR, we deployed acceptor ITR. Contrary to HeLa cell
cultures that were transfected with donor GFP ΔATG or
acceptor ITR alone, those exposed simultaneously to
GFP ΔATG and acceptor ITR readily revealed the presence
of GFP-positive cells (Figure 6A). The percentages of
these cells were similar to those observed in HeLa cell
cultures upon cotransfection of GFP ΔATG and
acceptor IR.1 (Figure 3). To rule out the possibility that
DNA sequences prone to secondary structure formation,
such as the AAV ITR, alter plasmid DNA transfection
efficiency, we transfected HeLa cells with pDsRed or
with pDsRed 2xITR. The former construct contains an
expression unit based on the human EF1α promoter,
the DsRed.T4 reporter and the rabbit β-globin pA,
whereas the latter has this transcription unit flanked
by AAV ITRs. Flow cytometric analysis at 72 hours
post-transfection showed that cultures exposed to
pDsRed 2xITR had a frequency of reporter-positive cells
as well as an amount of DsRed.T4 protein very similar
to those measured in cultures transfected with pDsRed.
We conclude that the ITR sequences do not significantly
affect the transfection efficiency of plasmid DNA
(Supplementary Figure S3).

To confirm, at the molecular level, the accurate repair of
GFP ORFs following inverted repeat-mediated HR, we
performed PCR analysis on extrachromosomal DNA
isolated from HeLa cells transfected with GFP ΔATG
alone or together with acceptor ITR. Extrachromosomal
DNA extracted from HeLa cells exposed to acceptor ITR
in combination with the nonhomologous donor construct
RFP ΔATG served as an extra negative control. The PCR
assay shown in Figure 6B is based on primer pairs 1/2 and
1/3 to amplify PCR products that are diagnostic for the
generation of GFP templates corrected by two-sided and
one-sided HR, respectively. Although primers 1 and 2 also
bind to the acceptor plasmid, their use did not yield any
PCR products (Figure 6C), most likely due to the inability
of the thermostable polymerase to read through the AAV
ITR. As shown in Figure 6C (upper and middle panels),
PCR fragments corresponding to reconstituted GFP
ORFs were exclusively detected in the DNA sample
from the cells that were cotransfected with donor
GFP ΔATG and acceptor ITR. Moreover, the results demonstrate
that gene repair was brought about by two-sided as
well as by one-sided HR (Figure 6C, lane 1 of upper panel
and lane 1 of middle panel, respectively). Of note, amplification
reactions carried out on an in vitro mixture of
GFP ΔATG and acceptor ITR plasmids did not yield any
product, showing that the detection of specific amplicons
in the DNA sample derived from HeLa cells cotransfected
with GFP ΔATG and acceptor ITR was not an artifact but
the result of genetic information exchange in vivo (not
shown). Internal control PCR amplifications using
primers 4 and 5 showed the presence of the homologous
and the nonhomologous donor templates GFP ΔATG and
RFP ΔATG, respectively, confirming the integrity of the
extrachromosomal DNA following the isolation procedure
(Figure 6C, lower panel). Next, we performed an independent
transfection experiment in HeLa cells followed
by the same DNA isolation procedure and PCR assay.
In this new experiment, however, an extra control was
included. This consisted of extrachromosomal
DNA from cells transfected with acceptor ITR mixed,
before PCR, with extrachromosomal DNA from cells
transfected with donor GFP ΔATG. Data depicted in
Supplementary Figure S4 show, once again, the presence
of the specific 1.2-kb amplicon exclusively in the sample
corresponding to cells cotransfected with donor GFP ΔATG and
acceptor ITR.

The PCR products obtained with the aid of primer pair
1/2 were inserted into a plasmid vector, after which nucleotide
sequence analysis of 20 randomly selected DNA
clones was carried out. This analysis showed
that 19 of these clones contained GFP ORFs without
any mutations, linking them to error-free HR events
(Figure 6D, the 5 uppermost nucleotide sequences serve
as examples of this analysis). The remaining clone had the
GFP ORF disrupted at the initially engineered premature
stop codon by AAV ITR-derived, heterologous DNA
(Figure 6D; lowest nucleotide sequence), suggesting that
it was the product of inverted repeat microhomology-directed
recombination (25) or of error-prone NHEJ,
possibly following center-break palindrome revision or
cruciform resolution (2,3,26,27). Additionally, we studied
the kinetics of homology-directed gene repair involving
acceptor IR.1 and acceptor ITR and directly compared it
with that of the conventional DSB-induced HR. To this
end, HeLa cells were transfected with donor GFP ΔATG
alone or together with acceptor IR.1, acceptor ITR or a
mixture of acceptor ScR and pCAG.I-SceI. Results shown
in Figure 6E reveal a time-dependent increase in the
number of GFP-positive cells (Figure 6E, upper graph)
and in the amount of reporter protein per GFP-positive
cell (Figure 6E, lower graph) in all cultures cotransfected
with acceptor plasmids and the donor GFP ΔATG construct.
Interestingly, the time-dependent increase in the
frequency and fluorescence intensity of GFP-positive
cells was faster in cultures exposed to acceptor ScR and
pCAG.I-SceI than in those incubated with acceptor ITR
or with acceptor IR.1. Moreover, no significant differences
in either of the GFP-specific parameters were found at any
of the time points tested in cell cultures exposed to GFP ΔATG
and either acceptor IR.1 or acceptor ITR (Figure 6E). We
postulate that the lower HR-inducing activity of the
inverted DNA repeat sequences when compared with
DSBs may relate to their transient nature. Perhaps secondary
structures formed in vivo by certain inverted
repeats and palindromes constitute 'facultative' or 'intermittent'
DNA lesions leading to a sporadic engagement of
the HR machinery. Another contributing factor may be
the larger number of biochemical reactions necessary to
process cruciform-like structures by HR than are
necessary to repair DSBs.
Figure 6. Comparison of the ability of simple and composite inverted DNA repeats to trigger homology-directed gene repair in mammalian cells. (A)
HeLa cells were transfected with GFP ΔATG or with acceptor ITR or were cotransfected with GFP ΔATG plus acceptor ITR. Analysis of GFP expression
by flow cytometry was carried out 4 days post-transfection on 10 000 viable cells per sample. Horizontal lines represent means corresponding to
nine independent experiments. (B) Schematic representation of the PCR-based assay deployed to detect GFP ORFs repaired via homology-directed
gene targeting. Primers 1 and 2 were designed to amplify templates resulting from two-sided HR whilst oligodeoxyribonucleotides 1 and 3 were used
to specifically detect products of one-sided HR. Primer 1 recognizes the first 24 nts of the GFP ORF. Oligodeoxyribonucleotides 2 and 3 target
sequences exclusively present in the acceptor and donor plasmid backbones, respectively. The PCR products of primers 4 and 5, which bind to the
rabbit β-globin pA (βG.pA) and SV40 pA (SV.pA), respectively, served as internal control for extrachromosomal DNA quality and quantity.
Amplicon size (in bp) expected for each primer pair is indicated. For an explanation of the other symbols see the legend of Figure 1A. (C) PCR

(continued) 
Replication of DNA repeat-containing acceptor molecules
does not significantly affect homology-directed gene repair 

SV40 is a mammalian double-stranded DNA virus with a
circular genome whose replication has been extensively
studied as a model for chromosomal nuclear DNA replication
in higher eukaryotes (28). The only viral cis-acting
element and trans-acting factor required for
SV40-dependent DNA replication are the ori and the
large T antigen protein, respectively. Thus, to investigate
the impact of target template replication on DNA
repeat-induced HR in mammalian cells, the SV40 ori
was introduced at equivalent positions in acceptor ScR,
acceptor DR.1, acceptor IR.1 and acceptor spIR.1 to generate
the constructs acceptor ScR.ORI, acceptor DR.1.ORI,
acceptor IR.1.ORI and acceptor spIR.1.ORI, respectively.
Next, each of these plasmids was individually transfected
into the SV40 large T-expressing COS-7 cells together with
the homologous donor construct GFP ΔATG or the
nonhomologous donor plasmid RFP ΔATG.
Extrachromosomal DNA isolated from these cells was
treated with DpnI to selectively digest the input prokaryotic
DNA and with XbaI to linearize de novo synthesized
acceptor DNA molecules and HR products. Southern blot
analysis of the digestion products using a GFP-specific
probe revealed SV40 ori-dependent accumulation of de
novo generated DNA molecules, demonstrating the replication
proficiency of the SV40 ori-containing acceptor
plasmids in COS-7 cells (Figure 7A). Interestingly,
acceptor DNA replication did not lead to a significant
increase in homology-directed gene repair levels in any
of the experimental setups tested (Figure 7B). Taken
together, these data indicate that, under the prevailing
experimental conditions, replication of acceptor DNA molecules
carrying a palindrome, a direct repeat or a spaced
inverted repeat does not significantly enhance
homology-directed gene repair.



DISCUSSION 

Repetitive DNA sequences include not only single direct
and inverted repeats (2-4), like the ones investigated in
this study, but also high-copy-number repetitive DNA
tracts such as those corresponding to 1-4 bp microsatellites
and 6-64 bp minisatellites (1). Microsatellites include
the expandable trinucleotide repeats associated with
neurodegenerative and neuromuscular disorders such as
Huntington's disease and oculopharyngeal muscular dystrophy.
Despite their diversity, several lines of evidence point to
the acquisition of non-B conformations by DNA at these
motifs (e.g. hairpins, cruciforms, G-quadruplexes, Z-DNA
and H-DNA) as a common culprit through which they
exert their biological effects, possibly in concert with
DNA metabolic and other DNA-related processes (4).
Indeed, an increasing number of experiments, mainly
carried out in bacterial and yeast model systems, indicate
that long single DNA repeats [i.e. >150 bp; (3)] with the
potential to form secondary structures (e.g. hairpins and
cruciforms) can serve as targets for the shuffling and
exchange of genetic information (2,3).

Knowledge about the biological activity of different
types of DNA repeats in relation to gene repair pathways
(especially single repeats in the size range <150 bp) and the
putative role played in these processes by DNA replication
is scant. This knowledge gap is particularly acute in cells
of higher eukaryotes (2,3). In this study, we have devised
an extrachromosomal functional read-out system based
on pairs of complementary DNA templates carrying defective
GFP-encoding sequences that can serve as substrates
for intermolecular HR-dependent gene repair.
This experimental system allowed us to investigate in a
quantitative manner the effect of various types of single
DNA repeats on the HR process in mammalian cells.
Furthermore, by endowing acceptor DNA molecules
with a eukaryotic origin of replication, we could probe
in a strict manner the role of template DNA synthesis
in repeat-induced homology-directed gene repair. We
found that, contrary to direct and spaced inverted
repeats, both simple palindromes and composite inverted
DNA repeats constitute targets for the HR pathway in
mammalian cells. Induction of homology-directed gene
repair was dependent on the arrangement and spacing of
the repetitive DNA unit rather than on its nucleotide
sequence. We also found that the presence of inverted
DNA repeat sequences in target molecules rendered
them susceptible to coordinated nicking by T7 endonuclease
I, a bona fide four-way DNA branch resolving enzyme
(29). These results are consistent with other in vitro data
showing that lineform-to-cruciform transition in
double-stranded DNA molecules relies on the presence
of an inverted repeat and is negatively affected by
intervening spacer sequences in a length-dependent
manner (22-24). Thus, we demonstrate that nonspaced
inverted DNA repeats per se can stimulate
homology-directed gene repair in mammalian cells, presumably
due to their capacity to form secondary structures
in vivo that can subsequently serve as direct targets
for cellular structure-specific nucleases (Figure 4). These



Figure 6. Continued 

analysis using primer pairs 1/2, 1/3 and 4/5 (upper, middle and lower panels, respectively) of extrachromosomal DNA isolated from HeLa cells
cotransfected with acceptor ITR and the homologous donor plasmid GFP ΔATG (lane 1) or with acceptor ITR and the nonhomologous donor plasmid
RFP ΔATG (lane 2) or from HeLa cells transfected with GFP ΔATG alone (lane 3). Lane M, GeneRuler DNA Ladder Mix molecular weight marker.
(D) Nucleotide sequence data of individual clones corresponding to PCR products obtained with primers 1 and 2. Clones #5, #9, #14, #16 and #18
represent products of HR containing a repaired GFP ORF. The encircled ORF-correcting cytosine is derived from the donor plasmid. Clone #4
corresponds to a rearranged acceptor template featuring 38 bp of the originally introduced AAV ITR (purple line above graph) and retaining the
engineered stop codon. The G marked with the asterisk is derived from the original acceptor template. (E) Flow cytometric analysis at the indicated
time points of HeLa cells transfected with the donor construct GFP ΔATG alone or together with either acceptor IR.1, acceptor ITR or a mixture of
acceptor ScR and the I-SceI-encoding plasmid pCAG.I-SceI. Both the frequencies of GFP-positive cells (upper graph) and their average mean
fluorescence intensities (MFI; lower graph) are presented.



Nucleic Acids Research, 2012, Vol. 40, No. 5 1997 




[Figure 7 image. Panel A: Southern blot with bands labeled de novo replicated DNA, input acceptor DNA and input donor DNA; lanes received donor GFP ΔATG or donor RFP ΔATG together with SV40 ori-negative or SV40 ori-positive acceptors ScR, IR.1, DR.1 and spIR.1. Panel B: bar graph for IR.1 and spIR.1 acceptors; x-axis, acceptor DNA replication.]

Figure 7. Testing the impact of target DNA synthesis on DNA repeat-mediated homology-directed gene repair. (A) SV40 ori-dependent DNA
replication of acceptor constructs. Acceptor plasmids containing the test sequences ScR, IR.1, DR.1 or spIR.1 and with or without SV40 ori were
transfected into COS-7 cells together with the homologous donor construct GFP ΔATG or the nonhomologous donor plasmid RFP ΔATG. At 3 days
post-transfection, extrachromosomal DNA was extracted and treated with XbaI and the prokaryotic DNA methylation pattern-sensitive restriction
enzyme DpnI. After agarose gel electrophoresis, the resolved DNA was subjected to Southern blot analysis using a GFP-specific probe.
DpnI-resistant, de novo replicated DNA was detected only in samples of cells transfected with SV40 ori-positive acceptor plasmids (right-hand
side upper panel). (B) Relative homology-directed gene repair frequencies in COS-7 cells transfected with GFP ΔATG, the indicated acceptor plasmids
with (+) or without (-) SV40 ori and in one case also pCAG.I-SceI.



processes may eventually lead to the formation of DSBs 
that constitute canonical substrates for, amongst others, 
NHEJ- and HR-based allelic and non-allelic recombin- 
ation. Indeed, Inagaki and coworkers have recently 
shown in 293 cells by using a two-plasmid system 
together with a PCR-based assay that large secondary 
structure-forming palindromic AT-rich repeats 
(PATRRs), often associated with translocations in the 
human germ line, stimulate intermolecular rearrangements 
via a pathway likely to involve NHEJ (27). The impact 
of template DNA replication on the PATRR-specific 
rearrangements was, however, not investigated. 

Resolution of cruciform-like structures is thought to 
start with the introduction of single-strand breaks on 
opposite sides of the branch point followed by a ligation 
step resulting in the generation of hairpin-capped termini 
that can be further processed by nicking to generate 
"open" ends. Candidate resolving and processing enzymatic 
activities are those of the first isolated bona fide 
mammalian Holliday junction resolvase Gen1 (30) and Mre11 
(31), respectively. Another candidate resolvase is the 
SLX4 complex (32). Possible outcomes of such ectopic 
recombination processes include chromosomal 
translocations and loss of heterozygosity. Related to 
this, in silico analysis of the human genome and 
experiments in yeast suggest that, during evolution, 
palindromes and inverted repeats with short spacers are 
counter-selected compared with direct repeats and inverted 
repeats with long spacers (9). Indeed, without implying 
causality, more recent computer-aided phylogenetic 
sequence analyses revealed a correlation between DNA 
repeat pairs, NHEJ and non-allelic HR in the shaping of 
mammalian genome evolution (33). 

Finally, we also showed that, at least under condi- 
tions that do not disrupt processivity of DNA synthesis, 
replication of molecules harboring the direct or the 
inverted DNA repeats did not significantly increase the 
frequencies of HR-dependent gene repair events when 
compared with those measured in the absence of 
acceptor DNA replication. This finding on single 
DNA repeats adds to recent results indicating that, at 
least in the case of the high-copy-number GAA-TTC 
trinucleotide repeat associated with Friedreich's ataxia, 
DNA rearrangements can ensue in the absence of 
replication (34). Other processes like DNA transcription 
and certain repair pathways such as the herein 
examined HR can provide for alternative mechanisms 
underlying DNA repeat-associated rearrangements as 
proposed elsewhere (1,35). Indeed, the fact that 
repeat-associated DNA instability can occur independ- 
ently of a replication-based mechanism (e.g. replication 
stalling or slippage) can also be circumstantially inferred 
from the significant age-dependent expansion of second- 
ary structure-forming trinucleotide repeats in 
post-mitotic neurons of patients afflicted by certain 
neurodegenerative disorders (1). 

Current models of inverted repeat-driven secondary 
structure formation in vivo posit that palindromes or 
quasi-palindromes can, under torsional strain, transit 
from line-form to cruciform in double-stranded DNA 
via intrastrand annealing. On the other hand, spaced 
inverted repeats, although being also self-complementary, 
can only hybridize in the single-stranded form such as 
when a DNA replication fork advances through them 
and concomitantly gives rise to the Okazaki initiation 
zone (OIZ) in the lagging strand. Possibly, under these 
conditions, and depending on the length of the repeat/ 
spacer sequences relative to that of the OIZ, lagging 
strand self-annealing becomes thermodynamically favor- 
able resulting in the formation of hairpins that can stall 
DNA replication (2,36). Interestingly, we showed that rep- 
lication of DNA molecules containing the spaced inverted 
repeat spIR.1 could not overcome their inability to 
promote homology-directed gene repair. Related to this, 
Voineagu and colleagues (36) have recently deployed an 
SV40 ori-based plasmid system and 2D agarose gel 
electrophoresis to demonstrate that Alu-derived 320-bp-long 
inverted repeats with no spacer or with a relatively short 
12-bp spacer led to replisome stalling in COS-1 cells, 
whereas the same inverted repeat with a 52-bp spacer 
did not. The authors interpreted these results as a conse- 
quence of the OIZ size limit not allowing effective 
stem-loop hairpin formation by inverted DNA repeats 
with the larger spacer. Thus, on the basis of our results 
and those of Voineagu and coworkers and toward dissect- 
ing the inverted repeat parameters allowing hairpin 
assembly through the postulated lagging strand 
displacement-dependent mechanism, it will be interesting 
to evaluate different-sized repeat/spacer sequences and 
their relationship with replication fork stalling on the 
one hand (36) and homology-directed gene repair on the 
other (this study). These experiments might help to 
define the rules underlying secondary structure formation, 
replisome stalling and the HR-inducing activity of 
inverted DNA repeats in mammalian cells. Finally, the 
functional genetic assay described herein might also be 
helpful in evaluating the effect of other types of DNA 
motifs/parameters on HR in higher eukaryotes and the 
contribution of cellular factors to this process. 